3D Scene Scanner and Position and Orientation System

ABSTRACT

A hand-held mobile 3D scanner (10) for scanning a scene. The scanner (10) comprises a range sensor (11) that is arranged to sense the location of surface points in the scene relative to the scanner (10) and generate representative location information, a texture sensor (12) that is arranged to sense the texture of each surface point in the scan of the scene and generate representative texture information, and a position and orientation sensor (13) that is arranged to sense the position and orientation of the scanner (10) during the scan of the scene and generate representative position and orientation information. A control system (14) is also provided that is arranged to receive the information from each of the sensors and generate data representing the scan of the scene.

FIELD OF THE INVENTION

The present specification relates to a three-dimensional (3D) scene scanner for scanning a scene. In particular, although not exclusively, the scanner can be utilised to capture data for generating 3D photo-realistic representations or 3D computer models of static objects or environments over wide areas, whether outdoor or indoor. The present specification also relates to an associated position and orientation system.

BACKGROUND TO THE INVENTION

Various types of 3D scanners are available, each more suited to specific applications, e.g. for scanning small objects with high resolution or for scanning large objects with low resolution. To scan all around an object requires that either the object is moved past the scanner, e.g. on a turntable, or the scanner is moved around the object.

Several types of known scanners are capable of capturing complete surface information of objects and scenes. Generally, these scanners can be separated into three categories, namely photogrammetric scanners, fixed station laser scanners and hand-held 3D shape scanners. The scanners generate data points or other structures representing the scene or object scanned and this data can be post-processed by software to allow visualisation and to generate 3D representations or 3D computer models of the scene or object.

Photogrammetric systems reconstruct a 3D scene or object based on analysis of multiple overlapping 2D images. Provided common features are visible and identified in the images and camera calibration parameters are known or determined, it is possible to extract 3D metric scene or object information. In some cases, the cameras are pre-calibrated. In other cases, self-calibration is attempted based on the image matches.

Fixed station scanners scan a scene from a fixed location. Typically, fixed station scanners are arranged to scan a modulated laser beam in two dimensions and acquire range information by measuring the phase-shift of the reflected modulated laser light or the time-of-flight of a reflected laser pulse. By panning the scanner through 360°, it is possible to produce a 360° panoramic range map of the scene. To scan a complete scene often requires moving the fixed station scanner to a number of different scanning locations. Depending on the size of the scene, scanning time is typically 10-30 minutes. Some fixed station scanners also comprise a digital camera that is arranged to capture colour information for each surface point in the scan of the scene so that dual colour and range images can be generated. Other fixed station scanners incorporate multiple lasers to allow acquisition of colour as well as range information.

Hand-held 3D shape scanners comprise a hand-held mobile scanner-head that is commonly manoeuvred by a user about the object being scanned. Typically, the scanner-head includes a range sensor for determining the local shape of the object by sensing the position in space of the surface points of the object relative to the scanner-head. For example, the range sensor may sense the position in space of the surface points via laser triangulation. The hand-held 3D shape scanners also comprise a position and orientation system that measures the position and orientation of the mobile scanner-head in space during the scan. The local shape information is then coupled with the scanner-head position and orientation information to enable a 3D computer model of the object to be constructed.

In this specification where reference has been made to patent specifications, other external documents, or other sources of information, this is generally for the purpose of providing a context for discussing the features of the invention. Unless specifically stated otherwise, reference to such external documents is not to be construed as an admission that such documents, or such sources of information, in any jurisdiction, are prior art, or form part of the common general knowledge in the art.

It is an object of the present invention to provide a flexible and portable 3D scene scanner that is capable of scanning wide-area scenes, or to provide a position and orientation system that is capable of sensing the pose of a mobile object in 6D, or to at least provide the public with a useful choice.

SUMMARY OF THE INVENTION

In a first aspect, the present invention broadly consists in a hand-held mobile 3D scanner for scanning a scene comprising: a range sensor that is arranged to sense the location of surface points in the scene relative to the scanner and generate representative location information; a texture sensor that is arranged to sense the texture of each surface point in the scan of the scene and generate representative texture information; a position and orientation sensor that is arranged to sense the position and orientation of the scanner during the scan of the scene by interacting with multiple reference targets located about the scene and generate representative position and orientation information; and a control system that is arranged to receive the information from each of the sensors and generate data representing the scan of the scene.

Preferably, the data generated relates to scanned surface points in the scene and may comprise information on the 3D positions of those surface points in space and texture information in relation to those surface points. More preferably, the data may further comprise viewpoint information in relation to the viewpoint from which the surface points were scanned by the scanner.

Preferably, the control system may be arranged to generate the texture information for the data based on texture values sensed by the texture sensor from multiple viewpoints during the scan.

Preferably, the control system may be arranged to generate a texture model representing the scan of the scene.

Preferably, the data generated by the control system may be in the form of rich-3D data.

Preferably, the control system may be arranged to generate a 3D substantially photo-realistic representation of the scene from the data.

Preferably, the control system may comprise a user interface that is operable by a user to control scanning parameters.

Preferably, the control system may comprise an output display that is arranged to generate a progressive representation of the scene as it is being scanned.

Preferably, the control system may be arranged to filter out data associated with scanned surface points in the scene that fall outside scanning zones that are selected by the user.

Preferably, the control system may be arranged to increase or decrease the resolution of the range and texture sensors for particular scanning zones that are selected by the user.

Preferably, the range sensor may comprise any one of the following: a light detection and ranging (LIDAR) device, a triangulation-based device, or a non-scanning time-of-flight camera device.

In one form, the texture sensor may comprise a colour camera that is arranged to capture digital images of the scene, each digital image comprising an array of pixels and each pixel or group of pixels corresponding to a surface point in the scan of the scene from which texture information can be extracted. In an alternative form, the texture sensor may comprise a multi-spectral laser imager that is arranged to sense texture information relating to the scanned surface points of the scene.

In one form, the position and orientation sensor may comprise an optical tracking device that senses the position and orientation of the scanner by tracking visible reference targets located about the scene. Preferably, the optical tracking device may comprise one or more direction sensors that are arranged to detect visible reference targets and generate direction information relating to the direction of the visible reference targets relative to the scanner, the optical tracking device processing the direction information to determine the position and orientation of the scanner. More preferably, the direction sensors may be optical sensors that are each arranged to view outwardly relative to the scanner to provide direction information relating to any visible reference targets.

The position and orientation sensor may additionally comprise an inertial sensor that is arranged to sense the position and orientation of the scanner and provide representative position and orientation information if the optical tracking device experiences target dropout.

In a second aspect, the present invention broadly consists in a portable 3D scanning system for scanning a scene comprising: a hand-held mobile scanner comprising: a range sensor that is arranged to sense the location of surface points in the scene relative to the scanner and generate representative location information; a texture sensor that is arranged to sense the texture of each surface point in the scan of the scene and generate representative texture information; and a position and orientation sensor that is arranged to sense the position and orientation of the scanner during the scan of the scene and generate representative position and orientation information; multiple reference targets for placing randomly about the scene, the position and orientation sensor interacting with detectable reference targets to sense the position and orientation of the scanner; and a control system that is arranged to control the scanner and its sensors and the reference targets, receive the information from each of the sensors, and generate data representing the scan of the scene.

Preferably, the data generated relates to scanned surface points in the scene and may comprise information on the 3D positions of those surface points in space and texture information in relation to those surface points. More preferably, the data may further comprise viewpoint information in relation to the viewpoint from which the surface points were scanned by the scanner.

Preferably, the control system may be arranged to generate the texture information for the data based on texture values sensed by the texture sensor from multiple viewpoints during the scan.

Preferably, the control system may be arranged to generate a texture model representing the scan of the scene.

Preferably, the data generated by the control system may be in the form of rich-3D data.

Preferably, the control system may be arranged to generate a 3D substantially photo-realistic representation of the scene from the data.

Preferably, the control system may comprise a user interface that is operable by a user to control scanning parameters.

Preferably, the control system may comprise an associated output display that is arranged to generate a progressive representation of the scene as it is being scanned.

Preferably, the control system may be arranged to filter out data associated with scanned surface points in the scene that fall outside scanning zones that are selected by the user.

Preferably, the control system may be arranged to increase or decrease the resolution of the range and texture sensors for particular scanning zones that are selected by the user.

Preferably, the range sensor may comprise any one of the following: a light detection and ranging (LIDAR) device, a triangulation-based device, or a non-scanning time-of-flight camera device.

In one form, the texture sensor may comprise a colour camera that is arranged to capture digital images of the scene, each digital image comprising an array of pixels and each pixel or group of pixels corresponding to a surface point in the scan of the scene from which texture information can be extracted. In another form, the texture sensor may comprise a multi-spectral laser imager that is arranged to sense texture information relating to the scanned surface points of the scene.

In one form, the position and orientation sensor may comprise an optical tracking device that senses the position and orientation of the scanner by tracking visible reference targets located about the scene. Preferably, the optical tracking device may comprise one or more direction sensors that are arranged to detect visible reference targets and generate direction information relating to the direction of the visible reference targets relative to the scanner, the optical tracking device processing the direction information to determine the position and orientation of the scanner. More preferably, the direction sensors may be optical sensors that are each arranged to view outwardly relative to the scanner to provide direction information relating to any visible reference targets.

In one form, the position and orientation sensor may additionally comprise an inertial sensor that is arranged to sense the position and orientation of the scanner and provide representative position and orientation information if the optical tracking device experiences target dropout.

In a third aspect, the present invention broadly consists in a method of scanning a scene comprising the steps of: operating a hand-held mobile scanner to scan the scene, the scanner comprising: a range sensor that is arranged to sense the shape of the object(s) in the scene on a surface point-by-point basis and generate representative shape information; a texture sensor that is arranged to sense the texture of the object(s) in the scene on a surface point-by-point basis and generate representative texture information; and a position and orientation sensor that is arranged to sense the position and orientation of the scanner in a local reference frame by interacting with multiple reference targets located about the scene and generate representative position and orientation information; obtaining the shape, texture, and position and orientation information from the sensors; processing the shape, texture, and position and orientation information; and generating data representing the scan of the scene.

Preferably, the step of processing the shape, texture, and position and orientation information may comprise extracting information about each surface point of the surfaces and objects in the scan on a point-by-point basis by computing the 3D position of each surface point in the local reference frame from the shape information and the position and orientation information; generating the texture information around the region of each surface point from the texture information from the texture sensor and the position and orientation information; and extracting the viewpoint from which the surface point was scanned by the scanner from the position and orientation information.

Preferably, the step of generating data representing the scene may comprise constructing rich-3D data.

Preferably, the method may further comprise the step of processing the data to generate a 3D substantially photo-realistic representation of the scene for display.

Preferably, the method may further comprise the step of placing reference targets about the scene and operating the position and orientation sensor of the scanner to interact with detectable reference targets to sense the position and orientation of the scanner in the local reference frame.

Preferably, the step of operating the hand-held mobile scanner to scan the scene may comprise scanning the surfaces and objects of the scene from multiple viewpoints.

Preferably, the step of operating the hand-held mobile scanner to scan the scene may comprise initially setting scanning zones within the scene such that the processing step discards any information relating to surface points of objects in the scene that fall outside the scanning zones.

In a fourth aspect, the present invention broadly consists in a mobile 3D scene scanner for scanning a scene comprising: a range sensor arranged to sense the shape of the object(s) in the scene on a surface point-by-point basis and generate representative shape information; a texture sensor arranged to sense the texture of the object(s) in the scene on a surface point-by-point basis and generate representative texture information; a position and orientation sensor arranged to sense the position and orientation of the scanner in a local reference frame by interacting with multiple reference targets located about the scene and generate representative position and orientation information; and a control system arranged to control each of the sensors, receive the information from each of the sensors, and generate data representing the surface points of the object(s) scanned in the scene.

Preferably, the data generated by the control system may be in the form of rich-3D data.

Preferably, the range sensor may comprise any one of the following: a light detection and ranging (LIDAR) device, a triangulation-based device, or a non-scanning time-of-flight camera device.

In one form, the texture sensor may comprise a colour camera that is arranged to capture digital images of the scene, each digital image comprising an array of pixels and each pixel or group of pixels corresponding to a surface point of an object in the scan of the scene from which texture information can be extracted. In another form, the texture sensor may comprise a multi-spectral laser imager that is arranged to sense texture information relating to the object(s) in the scene on a surface point-by-point basis.

In a fifth aspect, the present invention broadly consists in a position and orientation system for sensing the position and orientation of a mobile object that is moveable in an environment comprising: multiple reference targets locatable in random positions within the environment to define a local reference frame; an optical tracking device mounted to the mobile object comprising one or more direction sensors that are arranged to detect visible reference targets and generate direction information relating to the direction of the visible reference targets relative to the optical tracking device; and a control system arranged to operate the optical tracking device, receive the direction information, and process the direction information to initially determine the 3D positions of the reference targets and then generate position and orientation information relating to the position and orientation of the mobile object in the local reference frame during movement of the mobile object.

Preferably, the direction sensors of the optical tracking device may be optical sensors that are each arranged to view outwardly relative to the mobile object to provide direction information relating to any visible reference targets. More preferably, the optical sensors may comprise an arrangement of cameras that view outwardly relative to the mobile object to provide direction information relating to any visible reference targets.

Preferably, the one or more direction sensors of the optical tracking device may be arranged to form an omnidirectional direction sensor.

Preferably, the reference targets may each comprise a switchable light source that is arranged to emit light for sensing by the direction sensors of the optical tracking device.

Preferably, the control system is arranged to auto-calibrate in operation by automatically determining the 3D position of visible reference targets in the environment by processing direction information from the optical tracking device. More preferably, the control system may be arranged to auto-calibrate at start-up and continue to periodically auto-calibrate during operation to register the movement, removal, and addition of reference targets to the environment.

Preferably, the control system may be arranged to provide the user with feedback on the quality of the distribution of the reference targets within the environment after auto-calibration has taken place, the distribution of the reference targets affecting the accuracy of the position and orientation information generated.

Preferably, the control system may comprise a user interface that is operable by a user to control the system and an associated output display for presenting the position and orientation information.

Preferably, the position and orientation system may further comprise an inertial sensor that is arranged to sense the position and orientation of the mobile object and provide representative position and orientation information if the optical tracking device experiences target dropout.

In a sixth aspect, the present invention broadly consists in a method of sensing the position and orientation of a mobile object that is moveable in an environment comprising the steps of: placing multiple reference targets at random positions within the environment to define a local reference frame; mounting an optical tracking device to the mobile object, the optical tracking device comprising one or more direction sensors that are arranged to detect visible reference targets and generate direction information relating to the direction of the visible reference targets relative to the optical tracking device; operating the optical tracking device to track and sense visible reference targets as it moves with the mobile object in the environment and generate direction information; and processing the direction information to initially determine the 3D positions of the reference targets and then generate position and orientation information relating to the position and orientation of the mobile object in the local reference frame during movement of the mobile object.

Preferably, the step of operating the optical tracking device to track and sense visible reference targets may further comprise the step of operating the optical tracking device to initially determine the 3D position of visible reference targets in the environment by auto-calibrating from the direction information.

Preferably, the step of auto-calibrating may comprise: moving the mobile object into N locations in the environment and sensing direction information for visible reference targets at each location; calculating initial estimates of the position and orientation of the mobile object at the N locations; calculating accurate estimates of the position and orientation of the mobile object; and reconstructing the reference target 3D positions by triangulation using the direction information and the accurate estimates of the position and orientation of the mobile object. In one form, the step of calculating initial estimates may comprise executing a closed form algorithm. In one form, the step of calculating accurate estimates may comprise executing a non-linear minimisation algorithm.
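
By way of illustration only, the final triangulation step might be implemented along the following lines. This is a minimal sketch, assuming unit bearing vectors already expressed in the local reference frame and accurately estimated sensor positions; the function and parameter names are illustrative and not part of the invention.

```python
import numpy as np

def triangulate_target(sensor_positions, bearing_vectors):
    """Least-squares intersection of bearing rays to one reference target.

    sensor_positions: (3,) arrays, tracker positions in the local frame.
    bearing_vectors: (3,) arrays, unit directions toward the target.
    Returns the 3D target position minimising the summed squared
    perpendicular distance to all rays.
    """
    A = np.zeros((3, 3))
    b = np.zeros(3)
    for o, d in zip(sensor_positions, bearing_vectors):
        d = np.asarray(d, dtype=float)
        d = d / np.linalg.norm(d)
        P = np.eye(3) - np.outer(d, d)  # projector perpendicular to the ray
        A += P
        b += P @ np.asarray(o, dtype=float)
    return np.linalg.solve(A, b)

# Example: two known poses observing the same target at (1, 2, 0).
print(triangulate_target(
    [np.array([0.0, 0.0, 0.0]), np.array([3.0, 0.0, 0.0])],
    [np.array([1.0, 2.0, 0.0]) / np.sqrt(5), np.array([-2.0, 2.0, 0.0]) / np.sqrt(8)],
))  # -> approximately [1. 2. 0.]
```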

Preferably, the step of auto-calibrating occurs periodically to register the movement, removal, and addition of reference targets.

Preferably, the method of sensing the position and orientation of a mobile object may further comprise the step of feeding back information on the quality of the distribution of the reference targets within the environment after the auto-calibration step has taken place, the distribution of the reference targets affecting the accuracy of the position and orientation information generated.

Preferably, the step of processing the direction information to generate position and orientation information may comprise: calculating an initial estimate of the position and orientation of the mobile object using a boot-strapping process; predicting the current position and orientation of the mobile object based on the previous position and orientation estimate; associating the sensed direction information with specific individual reference target 3D positions; and updating the current position and orientation prediction using the individual reference target 3D positions and direction information. In one form, the step of predicting the current position and orientation of the mobile object may comprise extrapolating from the previous position and orientation estimate. In one form, the step of updating the current position and orientation prediction may comprise executing a non-linear algorithm.
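
By way of illustration only, the association step might be implemented by predicting the bearing to each known target from the extrapolated pose and matching observations by angular proximity. A minimal sketch, with illustrative names and an illustrative angular gate; the prediction and non-linear update steps are not shown.

```python
import numpy as np

def associate_bearings(observed_bearings, target_positions, predicted_position,
                       max_angle_rad=0.05):
    """Match each observed bearing to the closest predicted target bearing.

    observed_bearings: unit vectors sensed by the direction sensors.
    target_positions: 3D reference target positions from auto-calibration.
    predicted_position: extrapolated position of the mobile object.
    Returns a dict mapping observation index -> target index.
    """
    matches = {}
    for i, obs in enumerate(observed_bearings):
        obs = obs / np.linalg.norm(obs)
        best_j, best_angle = None, max_angle_rad
        for j, target in enumerate(target_positions):
            predicted = target - predicted_position
            predicted = predicted / np.linalg.norm(predicted)
            angle = np.arccos(np.clip(obs @ predicted, -1.0, 1.0))
            if angle < best_angle:
                best_j, best_angle = j, angle
        if best_j is not None:
            matches[i] = best_j
    return matches
```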

Preferably, the method of sensing the position and orientation of a mobile object may further comprise mounting an inertial sensor to the mobile object and operating the inertial sensor to sense the position and orientation of the mobile object and generate representative position and orientation information if the optical tracking device experiences target dropout.

In this specification and the accompanying claims, the term “texture” is intended to cover any information relating to the surface texture including, but not limited to, colour, such as hue, brightness and saturation, or grey-scale intensity information.

In this specification and the accompanying claims, the term “scene” is intended to cover any indoor or outdoor environment, surfaces and objects together within such environments, and also individual objects in isolation within such environments.

In this specification and the accompanying claims, the term “portable” in the context of a system is intended to cover any system that has components that may be packed into a carry-case or that are relatively easily transportable to different locations.

Unless the context requires otherwise, the term “targets” in this specification and the accompanying claims is intended to cover any powered or non-powered object, device, marker, landmark, beacon, pattern or the like.

In this specification and the accompanying claims, the phrase “visible reference targets” is intended to cover reference targets that are visible to the optical tracking device of the position and orientation system in that the targets are able to be sensed and not occluded.

In this specification and the accompanying claims, the phrase “surface points” is intended to refer to the points or patches on the surface of objects and surroundings within a scene being scanned.

In this specification and the accompanying claims, the phrase “rich-3D data” is intended to cover 3D point cloud data in which each surface point or patch has associated viewpoint information and texture values obtained from different viewpoints, or an associated texture model constructed from texture information sensed from multiple viewpoints during the scan for that surface point or patch.

The term ‘comprising’ as used in this specification and claims means ‘consisting at least in part of’, that is to say when interpreting statements in this specification and claims which include that term, the features, prefaced by that term in each statement, all need to be present but other features can also be present.

The invention consists in the foregoing and also envisages constructions of which the following gives examples only.

BRIEF DESCRIPTION OF THE DRAWINGS

Preferred embodiments of the 3D scene scanner and the position and orientation system will be described by way of example only and with reference to the drawings, in which:

FIG. 1 shows a schematic diagram of the main modules of the preferred form 3D scene scanner;

FIG. 2 shows a perspective view of a hand-held form of the 3D scene scanner;

FIG. 3 shows a schematic diagram of the coordinate systems employed by the 3D scene scanner algorithms;

FIG. 4 shows a schematic diagram of the functional architecture employed by the 3D scene scanner;

FIG. 5 shows a schematic diagram of the 3D scene scanner in use at the scene of a car crash;

FIG. 6 shows a perspective view of one form of optical tracking device of the position and orientation system;

FIG. 7 shows a perspective view of another form of optical tracking device of the position and orientation system; and

FIG. 8 shows a schematic diagram of the position and orientation system detecting visible reference targets in a scene in order to sense position and orientation.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS

The preferred form 3D scene scanner is a hand-held device for flexibly scanning and capturing data for generating 3D photo-realistic representations or 3D computer models of static objects or environments. In one form, the 3D scene scanner is operable to capture data for creating 3D photographs of scenes, environments and objects, whether located indoors or outdoors.

Referring to FIG. 1, the hand-held mobile 3D scene scanner 10 comprises three sensors, namely a range sensor 11, a texture sensor 12, and a position and orientation sensor 13. Each of the sensors communicates with a control system 14 onboard the scanner 10. The control system 14 comprises a CPU, microprocessor, microcontroller, PLC or other device and is arranged to control and operate the sensors, along with receiving and processing information generated by the sensors. In particular, the control system 14 is arranged to receive information from each of the sensors 11,12,13 and process that information to generate a set of data elements representing the scan of the scene. In the preferred form, each data element corresponds to a scanned surface point and may comprise information on the 3D position of that surface point in space, the texture values in the region of that surface point, and the position and orientation of the scanner 10 in space when the surface point was scanned. The control system 14 preferably comprises associated memory and input/output modules for storing and transferring this rich-3D data, which can be post-processed by software to allow visualisation and to produce 3D photo-realistic representations or 3D computer models of the scanned scene.
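
By way of illustration only, one element of such a rich-3D data set might be laid out as follows; the field names and types are illustrative assumptions, not a prescribed format.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass
class SurfacePointRecord:
    """One element of the rich-3D data set (illustrative layout only)."""
    position_world: Tuple[float, float, float]       # 3D position of the surface point in the local frame
    texture: Tuple[float, float, float]              # e.g. hue, brightness, saturation near the point
    scanner_position: Tuple[float, float, float]     # where the scanner was when the point was sensed
    scanner_orientation: Tuple[float, float, float]  # scanner orientation, e.g. roll, pitch, yaw
```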

The preferred form control system 14 may communicate with other external devices 15, such as computers or output displays for example, via wire or wireless link to transfer data and information. It will be appreciated that communication could be via wire or wireless connection, optical fibre connection, or over any other communication medium. The control system 14 may also have an associated user interface that is operable by a user to control the scanner 10 and manipulate the data it obtains during and after a scan of a scene. For example, a user may operate the user interface to alter scanning parameters, such as sensitivity, resolution, gain, range filters, speed, integration time, accuracy and the like. The user interface may also be onboard the scanner 10 and may comprise a control panel having buttons, switches, dials, touch screens or the like. An output display may also be associated with the control system for displaying preliminary or full 3D photo-realistic representations of the scanned scene from the scan data during and after the scan.

It will be appreciated that various types of control systems may be utilised with the scanner 10. While the preferred form control system is entirely onboard the scanner 10, other control systems may be partially onboard or may be external to the scanner 10 and may control it remotely. For example, the scanner 10 may comprise an onboard CPU that coalesces data and information from the three sensors 11,12,13 and then transfers that information, via a wireless link for example, to an external base CPU that processes, collates and stores the scanned data.

In the preferred form, the scanner 10 has a housing that mounts each of the sensors 11,12,13, the control system 14, the user interface and the output display. The housing is arranged to be hand-held via a handle part or the like and is freely mobile about a scene to be scanned. It will be appreciated that the scanner 10 may be coupled to a robot arm or other moveable mechanism for automatically scanning a scene or for assisting the user to scan a scene if desired. By way of example, FIG. 2 shows a possible form of the mobile hand-held scanner 10 that includes a longitudinal handle 5 for the right hand of a user and a transverse handle 3 for the left hand, or vice versa. A range sensor 11, texture sensor 12, user interface panel 7, inertial sensor 4, and direction sensor cameras 9 are visible in FIG. 2. The direction sensor cameras 9 are part of the optical tracking device of the position and orientation sensor 13. The inertial sensor 4 is also part of the position and orientation sensor 13 and is arranged to, for example, generate pose information if the optical tracking device experiences target dropout and/or enhance the robustness of the pose information generated by the optical tracking device. The position and orientation sensor 13 will be described in more detail later.

The range sensor 11 is arranged to obtain local shape information about surfaces and objects 16 within a scene on a point-by-point basis. In particular, the range sensor 11 is arranged to measure the distance and direction between the scanner 10 and each surface point in the scan of the scene and generate representative location information. The location information obtained is subsequently processed with the position and orientation information from the position and orientation sensor 13 to determine the 3D position of each surface point in space, typically defined by a local reference frame 17. The preferred form range sensor 11 is a light detection and ranging (LIDAR) device that generates a one-dimensional (1D) laser scan and computes time-of-flight measurements to sense the distance and direction between the scanner 10 and each surface point in the scan of the scene. The LIDAR device may have a range of, for example, 3 m with a 70 degree aperture and a resolution of 1 mm. It will be appreciated that a range sensor that generates a two-dimensional (2D) laser scan could also be utilised if desired. Alternatively, the range sensor 11 may comprise a triangulation-based device. For example, the range sensor 11 may comprise a camera arranged to view a laser light stripe generated by a laser device, the distance between the scanner 10 and each surface point being determined by triangulation calculations. In a further alternative form, the range sensor 11 may comprise a non-scanning sensor such as a time-of-flight camera range sensor. It will be appreciated that there are various types of range sensors that could be utilised in the scanner 10 and the range sensor 11 may be implemented using any 3D sensing principle. The type of range sensor utilised will depend on the desired specification defined by, for example, range, resolution, aperture size, viewing angle, accuracy and the like. Further, it will be appreciated that various properties of the range sensor 11, such as resolution, accuracy and sensitivity, may be controlled by the user during a scan.
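
By way of illustration only, the time-of-flight principle reduces to a simple relationship: the measured round-trip time of the laser pulse, multiplied by the speed of light and halved, gives the range. A minimal sketch:

```python
SPEED_OF_LIGHT = 299_792_458.0  # metres per second

def time_of_flight_range(round_trip_seconds: float) -> float:
    """Range to a surface point from the round-trip time of a laser pulse.

    The pulse travels to the surface and back, so the one-way
    distance is half the total path length.
    """
    return SPEED_OF_LIGHT * round_trip_seconds / 2.0

# A surface roughly 3 m away returns the pulse after about 20 nanoseconds:
print(time_of_flight_range(20e-9))  # ~3.0 m
```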

The texture sensor 12 is arranged to obtain texture information about surfaces and objects 16 within a scene on a point-by-point basis. In particular, the texture sensor 12 is arranged to sense the texture in the region of each surface point in the scan of the scene and generate representative texture information. The texture information includes data about the colour of each surface point scanned and may be represented by hue, brightness and saturation data. In operation, the texture sensor 12 determines the texture of each surface point by sensing the light reflected from the surface point. In the preferred form, the texture sensor 12 is a colour camera arranged to capture images of the scene, the images being processed to extract texture information in the region of each surface point in the scan of the scene. The colour camera may be a video camera or a still-shot digital camera that is arranged to capture digital images of the scene, each digital image comprising an array of pixels and each pixel or group of pixels corresponding to a surface point in the scan of the scene. In operation, the digital images are processed and the texture information relating to the surface points in the scan of the scene is extracted from the pixels of the digital images. The texture sensor data for all viewpoints is then processed with the position and orientation information to extract the texture information in the region of each surface point. It will be appreciated that any form of imaging device having a colour image sensor, whether CCD, CMOS or otherwise, may be utilised as the texture sensor 12 to capture the texture information. In alternative forms, the texture sensor 12 may be a multi-spectral laser imager arranged to sense the texture information relating to the surface points in the scan of the scene.
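
By way of illustration only, associating a pixel with a scanned surface point might be done by projecting the point into the camera image using the camera's calibrated pose and intrinsics. This is a minimal sketch assuming a pinhole camera model; the names and parameters are illustrative assumptions.

```python
import numpy as np

def project_to_pixel(point_world, R_world_to_cam, t_world_to_cam, fx, fy, cx, cy):
    """Project a world-frame surface point into a pinhole camera image.

    R_world_to_cam, t_world_to_cam: the camera's calibrated pose.
    fx, fy, cx, cy: the camera's calibrated intrinsics.
    Returns (u, v) pixel coordinates, or None if the point lies behind the camera.
    """
    p_cam = R_world_to_cam @ np.asarray(point_world, dtype=float) + t_world_to_cam
    if p_cam[2] <= 0.0:
        return None  # not visible from this viewpoint
    u = fx * p_cam[0] / p_cam[2] + cx
    v = fy * p_cam[1] / p_cam[2] + cy
    return u, v
```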

It will be appreciated that there may not necessarily be a 1:1 relationship between the surface point sensed by the range sensor 11 and the surface point sensed by the texture sensor 12. For example, the range sensor 11 has a particular laser spot size and the range measurement is derived from a particular surface patch area. The texture sensor 12 may have a different spatial resolution, so there may be many texture measurements corresponding to a particular surface patch area. Further, the same surface patch area may be captured from different scanner 10 positions, thereby providing more texture values. The scanner 10 is arranged to process the captured data in a coherent manner to allow visualisation and to produce 3D photo-realistic representations or 3D computer models of the scanned scene or object.

The position and orientation sensor 13 is arranged to continuously monitor the pose of the scanner 10 in space during a scan of a scene. In particular, the position and orientation sensor 13 is arranged to sense the position and orientation of the scanner 10 in six degrees of freedom (6D) within a local reference frame and generate representative position and orientation information, which is utilised in three respects. First, the position and orientation information is processed with the location information from the range sensor 11 to determine the 3D position in the local reference frame of each surface point in the scene scanned. Second, the position and orientation information is processed with the texture information from the texture sensor 12 to extract the texture of each surface point. Third, the position and orientation information provides information on the viewpoint of the scanner 10 when scanning each surface point in the scan of the scene, as it will be appreciated that each surface point may be scanned multiple times from different viewpoints and angles to generate the rich-3D data. The position and orientation sensor 13 may comprise optical, electromagnetic, GPS or inertial devices in combination or alone. A preferred form position and orientation sensor 13 will be described in detail later.

In operation, a user moves the hand-held scanner 10 progressively about the scene to scan all the desired objects and surfaces, much like a spray-paint gun would be used. As the scanner 10 is moved about the scene, the range sensor 11 and texture sensor 12 together rapidly record both the shape and colour of the scene, or more specifically the objects and surfaces within the scene. Because the scanner 10 knows its position and orientation in space, and hence its viewpoint at any instant in time, it may progressively build up data for creating a complete 3D scene representation as it is moved around to survey the scene.

The result of a survey of a scene is a large computer file containing a digital 3D point cloud with associated image textures and viewpoint directions. This rich-3D data may be post-processed and stored so that it can be visualised for virtual fly-throughs, scene re-examination, measurement, copying, digital manipulation, and the like.

As mentioned, the scanner 10 preferably generates rich-3D data that comprises 3D point clouds in which each point has been tagged with the view direction from which it was captured by the scanner 10, along with the associated surface colour captured from multiple viewpoints. Most surveys will have a lot of redundancy of surface points captured from many different angles. The scanner 10 is capable of dealing with this redundancy by building a texture model based on the texture data obtained from different viewpoints. This texture model will allow subsequent rendering of the scene from any viewpoint. The model may be progressively constructed as data is gathered. The advantage of rich-3D data is that it may provide enhanced spatial accuracy and photo-realism in reconstructed scenes.
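
By way of illustration only, one simple way such a texture model might combine redundant observations of a surface patch is a view-angle-weighted blend, which favours viewpoints that face the patch most directly and so down-weights grazing-angle and specular observations. A minimal sketch; the weighting scheme is an illustrative assumption, not the prescribed model.

```python
import numpy as np

def blend_texture(colour_samples, view_directions, surface_normal):
    """Blend colour samples of one surface patch captured from several viewpoints.

    Each sample is weighted by how directly its viewpoint faced the patch.
    view_directions point from the scanner toward the patch.
    """
    n = surface_normal / np.linalg.norm(surface_normal)
    total = np.zeros(3)
    weight_sum = 0.0
    for colour, d in zip(colour_samples, view_directions):
        d = d / np.linalg.norm(d)
        w = max(0.0, float(-(d @ n)))  # 1.0 for a head-on view, 0.0 at grazing angles
        total += w * np.asarray(colour, dtype=float)
        weight_sum += w
    if weight_sum == 0.0:
        return np.mean(np.asarray(colour_samples, dtype=float), axis=0)
    return total / weight_sum
```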

The scanner 10 is capable of scanning different sized scenes, whether large or small. It is also operable to scan isolated objects of varying size. The capabilities of the scanner 10 in this respect are dictated by the range, accuracy and sensitivity of the sensors and these specifications can be altered as desired. The preferred form scanner 10 is flexible in this respect and is operable to scan a broad range of different sized scenes and objects. For example, the preferred form scanner 10 is capable of scanning objects and environments of up to about 100 m in size (length or width), but ultimately there is no upper limit on the scene size except for what can practically be covered by the operator in reasonable time. The scanner 10 is operable to scan the surfaces of objects and environments at varying ranges, the lower and upper limits of the range being dictated by the capability of the sensors. The scanner 10 is designed to scan static objects and environments where nothing in the scene, except the scanner 10 itself, moves during the survey. As mentioned, the scenes could be terrain or objects of almost arbitrary complexity, and could be indoors or outdoors or both.

Referring to FIG. 3, the coordinate systems employed by the scanner 10 in operation will be described. The preferred form scanner 10 utilises a series of coordinate transformations to generate the rich-3D data representing the scan of the scene. The range 11, texture 12, and position and orientation 13 sensors each have their own coordinate system, CSR, CST and CSP respectively. Through a series of calibrations, these can be related to the mobile scanner coordinate system CSM and hence spatial measurements from each sensor 11,12,13 can be transformed into the CSM. As the mobile scanner 10 is moved around within a scene, the spatial relationship between the scanner 10 and a fixed world coordinate system CSW is determined by the position and orientation sensor 13. In operation, data derived from the range 11 and texture 12 sensors at any point in time is transformed firstly into CSM and then into CSW. The outputs of the position and orientation sensor 13 are interpolated to obtain an estimate of the position and orientation of the mobile scanner 10 at that time. These concepts are elaborated in more detail below with reference to the functional architecture shown in FIG. 4.
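
By way of illustration only, this chain of transformations might be expressed with 4x4 homogeneous matrices: a fixed calibration transform maps CSR into CSM, and the time-varying pose maps CSM into CSW. A minimal sketch with illustrative placeholder values.

```python
import numpy as np

def make_transform(rotation, translation):
    """Build a 4x4 homogeneous transform from a 3x3 rotation and a translation."""
    T = np.eye(4)
    T[:3, :3] = rotation
    T[:3, 3] = translation
    return T

# Fixed calibration transform CSR -> CSM (range sensor offset on the housing),
# and a time-varying pose transform CSM -> CSW from the pose sensor.
T_range_to_scanner = make_transform(np.eye(3), [0.0, 0.05, 0.0])
T_scanner_to_world = make_transform(np.eye(3), [1.0, 2.0, 0.0])

# A range measurement in CSR, as a homogeneous point.
p_range = np.array([0.1, 0.0, 2.5, 1.0])

# Transform firstly into CSM and then into CSW.
p_world = T_scanner_to_world @ T_range_to_scanner @ p_range
print(p_world[:3])  # -> [1.1  2.05 2.5 ]
```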

In its most generic form, the scanner 10 can have a number of different position and orientation subsystems, which collectively make up the position and orientation sensor 13. As mentioned, the subsystems may be optical, electromagnetic, GPS, inertial or the like.

The or each position subsystem provides an estimate of the position of its mobile position sensor, mounted to or within the mobile scanner 10, with respect to its assigned reference coordinate system, defined as the position of CSMp in CSWp. For example, a GPS sensor would provide its position with respect to the earth's latitude and longitude coordinate system plus altitude. Similarly, the or each orientation subsystem provides an estimate of the orientation of its mobile orientation sensor, mounted to or within the mobile scanner 10, with respect to its assigned reference coordinate system, defined as the orientation of CSMo in CSWo. For example, a gyro-based orientation sensor provides its orientation with respect to the defined inertial frame of reference. A magnetic compass defines its orientation with respect to the earth's magnetic field. It will be appreciated that some sensors may estimate both position and orientation. For example, an electromagnetic motion tracking system may be utilised. Such a system often utilises receivers, located on or within the mobile scanner 10, that provide their position and orientation information relative to a fixed-point reference source transmitting station.

As mentioned, the scanner 10 may employ multiple position and orientation subsystems of different types and the information derived from each can be processed to provide accurate position and orientation information pertaining to the mobile scanner 10 in a local reference frame or coordinate system. In essence, the scanner 10 may employ a hybrid position and orientation sensor comprising complementary sensors, which are combined so as to overcome the limitations of each. For example, an optical position and orientation subsystem (as described later) can be devised to provide highly accurate location sensing of a moving object, but loses its location when reference targets are obscured. Such a sensor could be combined with an inertial-based position and orientation sensing device to provide dead-reckoning position and orientation sensing during periods when the optical subsystem drops out.

The function or module Fpo transforms the position and orientation estimates from each position and orientation subsystem into a chosen common world coordinate system CSW. For example, GPS coordinates may be transformed into a local coordinate system with an origin at a certain location within a scene, or a gyro-derived orientation estimate may be referenced to a chosen coordinate reference frame. Fpo requires input parameters 8 relating each coordinate system to CSW. These parameters can either be defined explicitly, for example when one of the subsystem reference frames is defined as the world reference frame, or may be derived through a calibration process. The output of Fpo is the pose (position and orientation) of CSM in CSW.

The range sensor 11 produces data referenced to its own local coordinate system CSR. Function Fr translates this into CSM using parameters derived from calibrating the fixed range sensor 11 with respect to its position on or within the mobile scanner 10 housing. Similarly, the texture sensor 12 data is translated into CSM using function FT.

Fr may also be arranged as a data filter. In particular, Fr may be arranged to filter range data to remove points that, by reason of their relationship to other nearby points sensed by the range sensor, are deemed to be unwanted. By way of example, typical scenes are unlikely to contain individual points, or small collections of points, that are isolated in space from other points, so such points may be discarded at an early stage. It will be appreciated that Fr may be configured to automatically discard or dump data that it deems is unwanted in the context of the scan of the scene. The real-time data filtering capability of Fr increases data processing speeds as it enables the scanner 10 to discard or dump unwanted data early in the processing scheme.
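
By way of illustration only, such an isolated-point filter might use a simple neighbour-count criterion: any point with too few neighbours within a given radius is presumed spurious and dumped. A minimal sketch; the radius and threshold are illustrative assumptions.

```python
import numpy as np
from scipy.spatial import cKDTree

def drop_isolated_points(points, radius=0.05, min_neighbours=3):
    """Remove range points with too few neighbours within `radius` metres.

    Points isolated in space from all other returns are presumed spurious
    and are discarded before any further processing.
    """
    points = np.asarray(points, dtype=float)
    tree = cKDTree(points)
    # Number of points within `radius` of each point (includes the point itself).
    counts = tree.query_ball_point(points, r=radius, return_length=True)
    keep = np.asarray(counts) > min_neighbours
    return points[keep]
```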

The data from the position and orientation subsystems, range and texture sensors may not necessarily be coincident in time. Therefore, functions Ip and Io interpolate the data streams, enabling the position and orientation to be determined at arbitrary moments in time. Function FporT[1] transforms the range and texture data into CSW using the pose data from Fpo. FporT[1] may be manually configured to ignore data relating to certain areas or regions of the scene as selected by the user. For example, the user may be able to manually select zones, boundaries or regions of a scene that are to be scanned. There may be multiple regions and the boundaries can be set in any one or more dimensions in the X, Y or Z planes. By way of example, the user may set a lower Z-plane limit, for example 5 inches above the ground, and any data scanned that is below that limit is discarded instantly. Further, the user may designate a boundary in the scene in the X-Y plane around an object to be scanned in the scene, and any data scanned from outside that boundary is discarded instantly. It will be appreciated that any number of different forms of scanning boundaries, regions or zones may be set by the user. Further, there may be multiple scanning zones or boundaries designated within the same scene. It will be appreciated that the scanning boundaries, regions and/or zones may be designated in a number of ways by the user. For example, the user may set the scanning regions and the like via a user interface that displays a real-time representation of the scene, for example captured by a camera. Alternatively, the scanner 10 may be set into a scanning zone selection mode and may be carried around the scene and operated by the user to mark out and tag the zones and boundaries. It will be appreciated that the scanning zone selection may be achieved in other ways also. In each case, any data that relates to surface points or patches outside the selected zones or regions in the local reference frame of the scene is instantly disregarded and dumped.
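
By way of illustration only, the zone test in FporT[1] might be implemented as a simple membership check against user-selected boxes, as in the following minimal sketch; the axis-aligned box representation is an illustrative assumption and other zone shapes could be handled analogously.

```python
def inside_any_zone(point, zones):
    """True if a world-frame point lies inside at least one scanning zone.

    Each zone is an axis-aligned box ((xmin, ymin, zmin), (xmax, ymax, zmax)).
    """
    x, y, z = point
    return any(
        lo[0] <= x <= hi[0] and lo[1] <= y <= hi[1] and lo[2] <= z <= hi[2]
        for lo, hi in zones
    )

# Example: discard everything below 5 inches (0.127 m) above the ground
# within a 10 m x 10 m working area around the origin.
zones = [((-5.0, -5.0, 0.127), (5.0, 5.0, 100.0))]
print(inside_any_zone((1.0, 2.0, 0.5), zones))   # True: kept
print(inside_any_zone((1.0, 2.0, 0.05), zones))  # False: dumped instantly
```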

Function FporT[2] translates the data into a recognised standard format. It will be appreciated that the foregoing functional architecture is one example of how the information from the range 11, texture 12, and position and orientation 13 sensors may be processed to produce the rich-3D data representing the scan of the scene. Other algorithms and data processing methods may be implemented to achieve the same result. While the data processing method described is asynchronous, it will be appreciated that data processing may alternatively be implemented in a synchronous manner if desired.

The stored scanner 10 output data is a large data set, preferably in the form of rich-3D data. For example, every surface data point could contain its 3D position coordinates in CSW, the corresponding texture sensor values, plus the 3D coordinates in CSW of the scanning location from which this surface point was sensed. With the scanner 10, the same surface point may be sensed multiple times from different viewpoints. This is a desirable feature since multiple instances of the same surface point can be used to resolve shape or texture ambiguities caused, for example, by noisy data or specular reflections. The scanner 10 generates a complete surface description of all surfaces and objects in a scene in rich-3D data format.

Referring to FIG. 5, the typical operation of a preferred form 3D scene scanner 10 will be described. The preferred form scanner 10 to be described utilises a hybrid optical position and orientation sensor 13 comprising an optical tracking device, which tracks reference targets located in the scene, and an inertial sensor. An example embodiment of the optical tracking device and reference targets will be described in more detail later.

The scene depicted in FIG. 5 is a car crash 18 and the scanner 10 may be utilised to scan the scene, including the exterior and interior of the vehicles involved in the crash, surrounding objects, skid marks on the road and any other objects or surfaces desired. The data generated by surveying the scene with the scanner 10 may then later be utilised to create evidential 3D photo-realistic representations and 3D computer models of the crash scene for later analysis by, for example, investigators trying to determine the cause of the crash or the party at fault.

The preferred form portable 3D scanning system, comprising the hand-held mobile scanner 10, reference targets 20, and laptop 15, may be transported easily to the crash scene by the operator 19 in a carry-case or the like. To scan the scene, the operator 19 starts by placing a number of reference targets 20 in suitable locations around the scene. The reference targets 20 provide a framework for the optical tracking device of the position and orientation sensor 13 onboard the scanner 10, and in the preferred form system the locations of the reference targets 20 define a local reference frame within which the scanner 10 can operate. The locations of the reference targets 20 can be determined arbitrarily and randomly by the operator 19. The scanner 10 does not need to be pre-programmed with the reference target 20 locations as the target locations are obtained automatically by the scanner 10 in operation. The reference targets 20 can be tripod mounted or otherwise mounted in convenient locations, for example they may be placed on objects in the scene. The number of reference targets 20 is flexible and is dictated by the size and complexity of the scene. The key requirement is optimum visibility from the scanner 10 in operation. Not all the reference targets 20 need to be visible at all times, but to maintain position and orientation sensing integrity at least three must be visible at any one point in time.

As previously mentioned, the position and orientation sensor 13 may comprise a number of position and orientation subsystems to enhance accuracy and reliability. Therefore, the optical tracking device may be augmented with other inertial sensors, such as gyros, accelerometers, inclinometers or the like, to maintain spatial position and orientation sensing capability in the event of target visibility dropout. It will be appreciated that electromagnetic, GPS, or other types of position and orientation sensors or systems may be utilised to supplement the optical tracking device also. In the preferred form, an inertial sensor is provided to supplement the optical tracking device if there is target dropout and/or to enhance the robustness of the pose information provided by the optical tracking system. The position and orientation sensor may employ the optical tracking system alone or any other pose tracking system alone, but it will be appreciated that a hybrid sensor comprising a combination of tracking devices may be desirable in circumstances where the nature of the scene to be scanned reduces the effectiveness of a particular type of tracking device.

Once the reference targets 20 have been placed about the scene, the operator 19 can begin scanning the scene by moving the scanner 10 over and around the objects and surfaces to be scanned. The position and orientation sensor 13 of the scanner 10 automatically determines the relative locations of the reference targets 20 and its position and orientation with respect to them. After a while, for example when sufficient calibration information has been estimated, the scanner 10 will start displaying a rough visualisation of the scanned scene on an output display, either on the scanner 10 or on an external device communicating with the scanner 10.

There are few restrictions on scanner 10 motion and the same area can be scanned several times from different directions if desired. If the operator wants a rough picture right away, they can move through the working volume before capturing any detailed data. Scanning will be an interactive experience, with visual feedback preferably being provided by means of a screen on the scanner 10 to aid the operator 19 in assessing areas already scanned. The operator 19 is able to add/remove/move reference targets 20 at any time and the scanner 10 will automatically adjust by registering the new locations of reference targets 20. In the preferred form, the scanner 10 will also diagnose geometrically ill-conditioned reference target 20 configurations and advise the operator 19 to, for example, add or move a target. In particular, it will be appreciated that the accuracy of the optical tracking device is dependent on the number of reference targets that are visible and the geometric relationship of those visible targets relative to each other in the scene. If the targets are spaced closely together in one region of the scene, the pose information generated by the optical tracking device is likely to be less accurate than if the targets were distributed more evenly about the scene. In the preferred form, the scanner 10 may be arranged to provide the user with feedback on whether the optical tracking device can provide accurate pose information for a particular distribution and spread of targets within a scene. The user may then add more targets or rearrange the targets in accordance with that feedback on the quality of the distribution or spread.

As shown, the scanner 10 communicates wirelessly with an external device 15, such as a portable laptop, PDA or other mobile computer. In particular, the control system 14 onboard the scanner 10 is arranged to transfer scanned data to the external computer 15 for post-processing and storage. In one form, the external computer 15 may perform some control system functions for the scanner 10, such as communicating with reference targets 20 or controlling scanner 10 settings remotely via a wireless link. As mentioned, the control system for the scanner 10 can be partially external to the scanner itself in some embodiments. Furthermore, it will be appreciated that communication between the scanner 10, external devices 15, and reference targets 20 may be via wire if desired.

In the preferred form, the scanner 10 has an onboard user interface that is operable by an operator 19 to alter various scanner 10 settings, such as the sensitivity, accuracy, resolution, and range of the scanner 10 and its sensors 11, 12, 13. For example, objects of interest can be scanned at high resolution, whereas background objects, such as roads at a car crash scene, can be scanned at lower resolutions. As previously mentioned, the scanner 10 may be arranged to allow the user to select scanning zones and regions within the scene and this may be achieved, for example, via the user interface and an output display showing a representation of the scene containing the zones. During the scanning, the user may, via the user interface, change the scanning resolution, for example, to scan some areas in high resolution and other background areas in low resolution. Once the user has configured these settings, the scanner 10 will automatically adjust during the scan of the scene, for example to dump data relating to surface points or patches outside the selected scanning zones.
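By way of illustration only, the following Python sketch shows one way such zone-based data dumping could be realised. The specification does not fix a zone representation; axis-aligned bounding boxes and the function name are assumptions made here for the example.

```python
import numpy as np

def filter_points_by_zones(points, zones):
    """Keep only scanned surface points that fall inside at least one
    user-selected scanning zone; points outside all zones are dumped.

    points: (N, 3) array of surface points in world coordinates.
    zones:  list of (lo, hi) pairs, each a (3,) array defining an
            axis-aligned bounding box (an illustrative zone shape).
    """
    keep = np.zeros(len(points), dtype=bool)
    for lo, hi in zones:
        inside = np.all((points >= lo) & (points <= hi), axis=1)
        keep |= inside
    return points[keep]

# Example usage: one zone around the origin; everything else is dumped.
pts = np.random.uniform(-2.0, 2.0, size=(1000, 3))
zone = (np.array([-1.0, -1.0, -1.0]), np.array([1.0, 1.0, 1.0]))
kept = filter_points_by_zones(pts, [zone])
```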

Once the raw scan data has been acquired, post-processing data conversion and visualisation software can be utilised to allow the data to be useful in any application. It will be appreciated that the scanner 10 itself or any external device 15 processing the scanned 3D data may utilise such software and that the software can be customised to suit particular applications. For example, conversion software may process the rich-3D point-cloud data to produce alternative data constructs that can be imported into visualisation or CAD software programs.

An example of an optical tracking device utilised in the position and orientation sensor 13 of the scanner 10 will now be explained. The optical tracking device is mounted to or within the scanner 10 housing and is arranged to sense the position and orientation of the scanner 10 by tracking visible reference targets 20 located in the scene. In one form, the optical tracking device comprises one or more direction sensors arranged to detect visible reference targets 20 and generate direction information relating to the direction of the visible reference targets relative to the scanner 10. Further, the optical tracking device has a control system that processes the direction information to determine the position and orientation of the scanner 10 in space or with respect to a local reference frame or coordinate system defined by the location of the reference targets 20.

The optical tracking device comprises one or more direction sensors or a substantially omnidirectional direction sensor arranged to provide estimates of the direction to visible reference targets 20 in the scene. The omnidirectional direction sensor may comprise an arrangement of optical sensors arranged to view outwardly relative to the scanner 10 to provide direction information relating to any visible reference targets 20. The optical sensors may be video or still-shot digital cameras that utilise electronic image sensors, whether CCD, CMOS or otherwise. It will be appreciated that other types of direction sensors could be utilised, such as lateral effect photodiodes or the like.

By way of example, FIG. 6 shows one possible form of an omnidirectional direction sensor 21 that may be utilised by the optical tracking device. The sensor 21 is implemented optically using a set of video cameras 22, mounted on a mobile frame, that are arranged to view stationary reference targets 20 located in the scene to be scanned. For example, six cameras 22, outward pointing and rigidly mounted to or within the scanner 10 as if on each side of a cube, are used as direction sensors. It will be appreciated that more or fewer cameras may be utilised in other arrangements. Each camera 22 provides estimates of the direction to each reference target 20 visible to that camera. Rather than manually surveying the reference target 20 locations in the scene, the optical tracking system utilises an auto-calibration algorithm to automatically determine and maintain target locations. Once the reference target 20 positions are known, it is possible to uniquely identify the position and orientation (pose) of the group of cameras 22 and hence the scanner 10 within the local reference frame defined by the targets 20. It will be appreciated that the camera rig shown in FIG. 6 may be reduced in size and implemented in alternative ways to reduce the overall size and improve the mobility of the hand-held scanner 10. For example, FIG. 7 shows another possible form of omnidirectional direction sensor 26 that may be utilised by the optical tracking system. The sensor 26 is a cube-like structure with lens holes 28 in each wall. The structure houses six optical sensors that are each arranged to view outwardly through a respective lens hole 28 to detect visible reference targets as described previously. As mentioned, the optical sensors could be electronic image sensors or lateral effect photodiodes or any other technology that is capable of acting as a direction sensor to external targets.

Reverting to FIG. 6, each camera 22 is individually calibrated to determine its intrinsic calibration parameters, for example those parameters that characterise the optical, geometric and digital characteristics of the camera/lens combination, including focal length, optical centre of the sensor, and pixel height and width. These parameters allow each pixel in the camera 22 to be translated into a ray in space with respect to the camera's coordinate system.

The auto-calibration algorithm to determine reference target locations and the pose tracking algorithm to determine the pose of the scanner 10 will now be described in more detail with reference to FIG. 8.

To describe the algorithms, we assume that the optical tracking device utilises a set of M calibrated cameras that have a fixed spatial relationship relative to each other, for example they are rigidly coupled together. The set of cameras is referred to as the camera group. The cameras are organised to approximate an omnidirectional camera, with their optical centres as coincident as physically possible and the image planes providing maximum coverage with as little overlap as possible. The geometric relationship between each camera is known by means of a group calibration process, so the camera group can be characterised by a single moving coordinate system denoted CSM. For example, CSM may be identified with the camera coordinate system of one of the cameras. The cameras are synchronised so that a single frame capture event will grab M images. A world coordinate system, denoted CSW, is defined in terms of K stationary reference targets 20 located in a scene 23 (e.g. one target is at the origin, another on the x-axis and another in the xy-plane).
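By way of illustration only, the following Python sketch shows one way CSW could be constructed from the three defining targets (one at the origin, one on the x-axis, one in the xy-plane) using Gram-Schmidt orthogonalisation; the function name is chosen here for illustration.

```python
import numpy as np

def world_frame_from_targets(t0, t1, t2):
    """Build the world coordinate system CSW from three target positions.

    t0 defines the origin, t1 lies on the +x axis, and t2 lies in the
    xy-plane. Returns (R, origin), where the columns of R are the CSW
    axes expressed in the input coordinate system.
    """
    x = t1 - t0
    x /= np.linalg.norm(x)
    # Remove the x-component of (t2 - t0) so y lies in the xy-plane
    # and is orthogonal to x (Gram-Schmidt).
    y = (t2 - t0) - np.dot(t2 - t0, x) * x
    y /= np.linalg.norm(y)
    z = np.cross(x, y)               # right-handed z axis
    R = np.column_stack([x, y, z])
    return R, t0
```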

In both the auto-calibration and pose tracking methods, a set of M images of the reference targets 20 is captured (one from each camera) at each camera group CSM position. Not all the reference targets 20 will necessarily be visible, as the sensors do not give complete coverage and the targets may also be occluded. Some reference targets 20 may also be visible in more than one image because of small amounts of sensor overlap.

A target pixel location is extracted using a sub-pixel estimator and then back-projected through the camera model to get a ray in the camera coordinate system. The ray is then rotated and translated to provide a line in CSM.
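By way of illustration only, the following Python sketch shows this back-projection step, assuming a simple pinhole camera model with intrinsic matrix K and no lens distortion (the specification does not fix the camera model; the names here are illustrative).

```python
import numpy as np

def pixel_to_line_in_csm(u, v, K, R_cam, t_cam):
    """Back-project a sub-pixel target location (u, v) into a line in
    the camera group coordinate system CSM.

    K:            3x3 intrinsic matrix of the individual camera
                  (focal length, optical centre, pixel size).
    R_cam, t_cam: pose of this camera within CSM from the group
                  calibration, mapping camera coordinates into CSM.
    Returns (point, direction): a point on the line and a unit
    direction vector, both expressed in CSM.
    """
    # Ray in the camera coordinate system (pinhole model, no distortion).
    ray_cam = np.linalg.solve(K, np.array([u, v, 1.0]))
    ray_cam /= np.linalg.norm(ray_cam)
    direction = R_cam @ ray_cam      # rotate the ray into CSM
    point = t_cam                    # the line passes through the optical centre
    return point, direction
```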

In the auto-calibration method, the camera group CSM is moved to N positions 24 in CSW and for each position a sequence of images of targets is acquired, with targets selectively activated according to a pattern. Analysis of the sequence of images (frame events) yields a set of lines in CSM, corresponding to the set of visible targets, labelled with their target identifiers.

By way of example only, the camera group may have six outward looking cameras (M=6) as previously described and there may be 32 reference targets in the scene. At each of the N positions in the scene the camera group may be arranged to rapidly capture 33 frame events (sets of 6 images) as the reference targets are selectively activated to sequence through a pattern. The pattern may, for example, involve selectively activating one target at a time for the first 32 frame events and then activating all 32 targets for the 33rd frame event. As previously mentioned, analysis of the 33 frame events yields a set of lines in CSM, corresponding to the set of visible targets, labelled with their target identifiers.
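By way of illustration only, the following Python sketch shows how labelling falls out of the one-target-at-a-time pattern just described: any line detected during frame event k must belong to target k. The role of the final all-on frame event is not detailed here, so the sketch simply ignores it; the function name is illustrative.

```python
def label_lines_from_sequence(lines_per_frame, num_targets=32):
    """Label observed lines using the one-target-at-a-time pattern.

    lines_per_frame: list of length num_targets + 1; entry k holds the
    lines in CSM detected during frame event k. For the first
    num_targets frame events only target k is lit, so any line seen in
    frame k belongs to target k. The final all-on frame event is not
    used for labelling in this sketch.
    Returns a dict mapping target identifier -> list of observed lines.
    """
    labelled = {}
    for target_id in range(num_targets):
        detections = lines_per_frame[target_id]
        if detections:               # target visible to at least one camera
            labelled[target_id] = detections
    return labelled
```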

It will be appreciated that various pattern sequences may alternatively be utilised. For example, binary coded sequences or other coded sequences could be implemented to selectively activate the targets as the camera group captures frame events at each of the N positions. Further, the number of frame events required at each of the positions is inherently related to the type of sequenced pattern generated by the reference targets.

More efficient pattern sequences will require fewer frame events at each of the N positions to yield the required set of labelled lines. It will also be appreciated that the number of reference targets may be varied according to the nature of the scene; ultimately, the auto-calibration method may work with considerably fewer targets. Further, the number and location of the N positions may vary and can be selected at random.

Once the above process has been carried out, the problem can be stated as: given a set of labelled lines in CSM captured from N positions, recover the positions of the K targets in CSW. The solution involves three steps:

-   (1) Calculate initial estimates of the camera group pose for n = 1, . . . , N, using a closed form algorithm;
-   (2) Calculate accurate estimates of the camera group pose by non-linear minimization; and
-   (3) Reconstruct the target positions in CSW by triangulation using the set of labelled lines and accurate camera group pose estimates.

Referring to step (1), this is carried out to avoid ambiguities in the non-linear minimization that occurs in step (2). In particular, the initial estimates of the camera group pose at each of the N positions assist in avoiding local minima of the objective function used in performing the non-linear minimization to calculate accurate estimates of the camera group pose. There are various closed form algorithms that may be utilised to calculate the initial estimates and each algorithm makes certain assumptions. By way of example, one closed form algorithm calculates the initial estimates based on approximating the camera group by a more idealised omnidirectional camera in which all the optical centres are coincident and positioned at the origin of CSM. This allows many standard results to be applied directly and the results may be expressed in terms of geometric algebra. The algorithm may then employ the well-known essential transformation to determine the relative pose between each of the N positions. The relative pose may be expressed in terms of direction and orientation, but not distance, as a unit translation is assumed. The essential transformation considers the labelled lines at each of the N positions of CSM and obtains estimates of the relative pose (with unit translation) between the positions of CSM. These estimates can then be transformed into CSW to give initial estimates of the relative direction and orientation between the positions of CSM, and known distances (yardsticks) can be used to rescale the estimates. For example, yardsticks might be obtained by measuring the actual distance between two reference targets.
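The specification leaves the details of the essential transformation open (and notes the results may be expressed in geometric algebra rather than matrices). Purely by way of illustration, the following Python sketch shows the standard SVD-based decomposition of an essential matrix into its four candidate relative poses with unit translation, which is one conventional way this step could be realised; the function name is illustrative.

```python
import numpy as np

def relative_pose_from_essential(E):
    """Decompose an essential matrix into candidate (R, t) pairs.

    Returns the four standard candidates; the physically valid one is
    normally selected by checking that the observed targets lie in
    front of the (idealised) omnidirectional camera. The translation t
    is known only up to scale (unit translation), consistent with the
    yardstick rescaling described in the text.
    """
    U, _, Vt = np.linalg.svd(E)
    # Enforce proper rotations (determinant +1).
    if np.linalg.det(U) < 0:
        U = -U
    if np.linalg.det(Vt) < 0:
        Vt = -Vt
    W = np.array([[0.0, -1.0, 0.0],
                  [1.0,  0.0, 0.0],
                  [0.0,  0.0, 1.0]])
    t = U[:, 2]
    return [(U @ W @ Vt, t), (U @ W @ Vt, -t),
            (U @ W.T @ Vt, t), (U @ W.T @ Vt, -t)]
```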

Referring to step (2), accurate estimates of the camera group CSM pose at each of the N positions are calculated using a non-linear minimisation algorithm and the initial estimates calculated in step (1). By way of example, an objective (error) function and its gradient are utilised so that the problem can be formulated as a standard non-linear optimisation. The non-linear optimisation does not assume the idealised omnidirectional camera approximation. The lines associated with a given target will nearly intersect; image noise and quantisation, calibration errors, camera modelling errors and the like will prevent them from intersecting exactly. The objective (error) function is utilised to minimise the dispersion of the lines about their nominal intersection point. The objective function, constraints such as yardsticks, the gradients of the objective function, and the initial estimates enable the problem to be formulated as a standard optimisation with barrier functions, or a constrained optimisation. The results of the optimisation provide accurate estimates of the camera group CSM pose at each of the N positions in CSW.
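By way of illustration only, the following Python sketch shows a per-target cost of the kind described: the dispersion of a set of lines about a candidate intersection point, measured as the sum of squared perpendicular distances. The full objective would sum this over all K targets as a function of the pose parameters and be passed to an off-the-shelf optimiser; the parameterisation and function name here are assumptions made for the example.

```python
import numpy as np

def line_dispersion_cost(points, directions, x):
    """Sum of squared perpendicular distances from a candidate
    intersection point x to a set of lines observing the same target.

    points:     (L, 3) array, one point on each line.
    directions: (L, 3) array of unit direction vectors.
    """
    diff = x - points                                   # (L, 3)
    # Component of each offset perpendicular to its line direction.
    along = np.sum(diff * directions, axis=1, keepdims=True)
    perp = diff - along * directions
    return np.sum(perp ** 2)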

Referring to step (3), the positions of the K targets in CSW may then be triangulated using the set of labelled lines and the accurate camera group pose estimates at each of the N positions. This involves mapping the labelled lines into CSW and then triangulating the lines associated with each of the K targets. During steps (1)-(3), a temporary assumption is made that the origin of CSW is identified with one of the N positions of the camera group CSM. This assumption can now be dropped: the true CSW is constructed from the targets and the target positions are mapped into this CSW.
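Purely as an illustrative sketch, the least-squares intersection of a bundle of lines (the point minimising the dispersion cost above) has a simple closed form, shown below in Python; the function name is illustrative, and degenerate (near-parallel) line sets would need separate handling.

```python
import numpy as np

def triangulate_target(points, directions):
    """Least-squares intersection of lines observing one target.

    Solves sum_i (I - d_i d_i^T)(x - p_i) = 0 for x, i.e. the point
    minimising the summed squared perpendicular distance to all lines.
    """
    I3 = np.eye(3)
    A = np.zeros((3, 3))
    b = np.zeros(3)
    for p, d in zip(points, directions):
        P = I3 - np.outer(d, d)      # projector onto the plane normal to d
        A += P
        b += P @ p
    return np.linalg.solve(A, b)     # ill-conditioned if lines are parallel
```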

As mentioned, once the auto-calibration method has obtained the target locations, the scanner can be arranged to diagnose geometrically ill-conditioned reference target configurations and provide feedback to the user on whether the optical tracking device can provide accurate pose information for a particular distribution and spread of targets within a scene. The target location feedback may advise the user to, for example, add a target or rearrange the targets for a better spread across the scene. For example, the target location feedback algorithm may involve calculating some measure of dispersion of the unit direction vectors from the direction sensors to each target, where a high dispersion represents a good target configuration and a low dispersion indicates a poor target configuration.
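The specification does not fix a particular dispersion measure. By way of illustration only, the following Python sketch uses one simple possibility, based on the length of the mean of the unit direction vectors: tightly clustered directions give a score near zero, widely spread directions a score near one. The threshold and names are illustrative assumptions.

```python
import numpy as np

def target_spread_score(directions):
    """Dispersion of unit direction vectors from the scanner to the
    visible targets. Near 0: targets clustered in one region of the
    scene (poor geometry); near 1: targets spread widely around the
    scanner (good geometry).
    """
    mean = np.mean(directions, axis=0)
    return 1.0 - np.linalg.norm(mean)

# Illustrative usage: warn the operator when the spread is poor.
# if target_spread_score(dirs) < 0.3:   # threshold chosen for illustration
#     advise_operator("Add or move reference targets for a better spread")
```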

In the pose tracking method, the camera group CSM is moved continuously in CSW and a continuous sequence of target image sets is acquired, each containing M images. For each image set (frame event), the set of observed lines in CSM, corresponding to the set of visible targets, can be determined. The problem can now be stated as: given the set of lines in CSM for each frame capture event, estimate the pose of the camera group. The camera group pose estimation is accomplished by repeating the following steps:

-   (1) Predict the current pose based on the previous pose estimates using extrapolation;
-   (2) Associate target labels with each line in CSM; and
-   (3) Update the current pose prediction from the labelled targets and lines using a non-linear algorithm.

Before steps (1)-(3) are traversed, the pose tracking method implements an initial boot-strapping process to provide an initial pose estimate. The initial pose estimate is obtained by maintaining the camera group CSM at an initial position and then acquiring a sequence of images of the targets as they are selectively activated according to a pattern. By way of example, this process may be similar to that carried out by the auto-calibration process at each of the N positions. Analysis of the sequence of images (frame events) yields a set of lines in CSM, corresponding to the set of visible targets, labelled with their target identifiers. As the target positions are now known, an initial pose estimate of the camera group can be determined using triangulation.

After the boot-strapping process, steps (1)-(3) are repeated rapidly for each frame event to provide continuous estimates of the camera group pose in CSW. As mentioned, during steps (1)-(3) all reference targets are activated during each frame event and hence the observed lines to each of the targets in CSM are not labelled.

Referring to step (1), extrapolation is used to predict the current pose based on previous pose estimates. Referring to step (2), the predicted pose and the previous set of labelled lines are used to get a prediction of the next position of the labelled lines. The observed set of lines, whose identifiers are unknown, is associated with target labels or identifiers based on the predicted labelled lines; this is essentially done by matching the predicted labelled lines with the closest observed unknown lines. Referring to step (3), the labelled set of observed lines is now utilised to provide a camera group pose estimate and update the current pose prediction using a non-linear algorithm for the next iteration of steps (1)-(3).
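By way of illustration only, the following Python sketch shows schematic versions of the predict and associate steps. A constant-velocity extrapolation and a nearest-direction matching rule are assumptions made here; the update step would be a non-linear refinement of the same kind as the auto-calibration objective and is omitted.

```python
import numpy as np

def predict_pose(prev_poses):
    """Constant-velocity extrapolation of pose (R, t) from the two
    most recent estimates: repeat the last incremental motion."""
    (R1, t1), (R2, t2) = prev_poses[-2], prev_poses[-1]
    t_pred = t2 + (t2 - t1)            # linear extrapolation of position
    R_pred = R2 @ (R1.T @ R2)          # reapply the last incremental rotation
    return R_pred, t_pred

def associate_lines(predicted_lines, observed_lines):
    """Match each predicted labelled line to the closest observed line.

    predicted_lines: dict mapping target id -> predicted unit direction.
    observed_lines:  list of observed unit directions (unlabelled).
    Returns a dict mapping target id -> index of the matched observation.
    """
    labels = {}
    for target_id, d_pred in predicted_lines.items():
        dots = [np.dot(d_pred, d_obs) for d_obs in observed_lines]
        labels[target_id] = int(np.argmax(dots))   # most nearly parallel line
    return labels
```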

Reverting to the boot-strapping process, this provides an initial prediction of the current pose for step (1) of the first iteration of steps (1)-(3). The boot-strapping process is not utilised after the first iteration.

It will be appreciated that not all reference targets necessarily have to be activated for each frame. The pose tracking method could utilise a patterned sequence of activation in alternative forms of the tracking method.

As the optical tracking device is onboard the scanner 10, the camera group pose reflects the position and orientation of the scanner 10.

The preferred form reference targets 20 comprise a powered light source that emits visible electromagnetic radiation for detection by the cameras 22 of the optical tracking device. For example, the light sources may be LEDs, or more particularly high-power blue LEDs. The LEDs may be operated to glow continuously or may be pulsed at a particular switching frequency. In the preferred form, the LEDs of the reference targets 20 are pulsed in synchronism with the direction sensor camera shutters. This allows a much higher LED intensity than that possible with continuous operation and improves LED contrast with ambient light, since each camera is only acquiring light when the LEDs are operating. Further, the direction sensor cameras may utilise optical filtering to further enhance the contrast with ambient light. The preferred form LEDs are individually addressable to provide a facility for unique identification by the direction sensors.

It will be appreciated that other types of reference targets may be utilised if desired. Essentially, the only limitation is that the reference target must have a characteristic that is distinguishable and detectable by the optical tracking device. For example, the reference targets may comprise any mobile powered or non-powered object, device, beacon, marker, pattern, landmark or combination thereof.

It will be appreciated that the optical position and orientation system, comprising the optical tracking device and reference targets, may be utilised in other applications to sense the position and orientation of any mobile object. The position and orientation system is not limited to scene scanning applications and may be utilised as a 6D optical tracking system in its own right in other applications. The optical position and orientation system works on an inside-out basis, where the optical tracking device is mounted to or within the object to be monitored, and this arrangement is particularly suited to certain applications. The position and orientation system could be integrated with other systems and may comprise its own user interface and output display for displaying position and orientation information. The control system onboard the optical tracking device may also be arranged to communicate with external devices, via input/output modules, to transfer data. The preferred form optical position and orientation system is capable of working within a vast range of different sized areas, whether indoor or outdoor. For example, the optical tracking device may detect targets up to 50 m away, but it will be appreciated that the range and accuracy of the optical tracking device can be extended by utilising different components if desired. In addition, the optical position and orientation system may be supplemented with additional electromagnetic, GPS, or inertial sensors to enhance robustness or to aid in target dropout situations. Such hybrid configurations of the position and orientation system will be advantageous in certain environments.

Summary of the 3D Scene Scanner Functionality

The 3D scene scanner is a photo-realistic, wide-area, flexible 3D scanning device: it provides free-form scanning of large and complex-shaped objects or scenes over wide areas, provides both 3D metric and surface texture data, and works on a wide range of materials including ferrous materials. The scanner embodies a sophisticated integration of sensors, electronic systems and computer vision/geometric algorithms which together enable the acquisition of 3D photo-realistic representations and 3D computer models of scenes and objects. The flexible hand-held scanner can capture accurate photo-realistic 3D data over wide areas, different sized scenes, and different sized objects. The scanner integrates position and orientation sensing, range sensing and texture sensing technologies to enable the acquisition of complete photo-realistic 3D scene and object models.

The 3D scene scanner is hand-held and this enables a vast variety of complex objects or scenes to be scanned. Objects of any shape can be scanned, in the amount of detail needed, inside and outside, at any angle and from any direction. The only proviso is that the position and orientation sensor must maintain positional integrity during the scanning process. For the optical tracking device, this means at least three reference targets must be viewable by at least one of the direction sensor cameras. However, in the preferred form 3D scene scanner the position and orientation sensor will be a hybrid of the optical tracking device and an inertial sensor or other supplementary sensors or tracking devices that maintain pose estimates if the optical tracking device drops out.

The 3D scene scanner has a user interface that is operable by a user to activate different functions and alter settings as desired. For example, it is possible to register regions of interest, enabling a user to scan two or more regions in the same scene in detail with little in between, yet maintain the exact spatial relationships between the scanned regions. The scanner also provides the user with detail control. For example, a user can vary the scanning density to obtain highly accurate detail where it is needed and coarse sampling where detail is less important, all within the same scene. The scanner produces rich-3D data, which is a data structure that contains all the multiple directions from which each 3D point has been surveyed, as well as the captured image data at that point from multiple directions. The rich-3D data captured by the scanner opens up many post-processing possibilities for super-realistic scene visualisation. The scanner is very flexible in that it is easy to transport, to set up and to use immediately in almost any environment, with all on-site calibration happening automatically. Also, the inside-out optical position and orientation sensor utilised in the preferred form scanner enables unconstrained reference target placement and automatic self-calibration of the target positions.

While the preferred form 3D scene scanner has been described as a hand-held mobile scanning device, it will be appreciated that the scanner may be arranged to be mounted to a vehicle, aircraft or any other mobile platform for scanning of larger scenes, for example cityscapes or landscapes. The 3D scene scanner is a local range and image scanner that always knows its position and orientation in space. By combining range and image data with knowledge of the position from which these are obtained, the scanner is able to generate data that can be processed to reconstruct a spatially accurate and photo-realistic representation of a scanned environment. As mentioned, the scanner may be provided in a hand-held form, or it is portable in that it can be mounted to a mobile platform, such as a vehicle, aircraft, robot scanning mechanism or the like.

It will be appreciated that the concepts underlying the scanner are extendible and may be implemented with various technologies. For example, the position and orientation sensing technology could be changed or upgraded without compromising the overall design integrity, or the 1D laser range sensor could be swapped for another using a different 3D sensing principle.

APPLICATIONS

The 3D scene scanner and the rich-3D data it generates may be utilised in a myriad of applications. For example, photo-realistic 3D visualisation is becoming an important requirement of applications as diverse as surgical training and digital special effects.

The rich-3D data generated by the scanner contains enormous amounts of visual and spatial information about scenes and objects. This presents a significant opportunity to develop new ways of acquiring, representing and rendering such rich-3D data. By way of example, potential applications of the scanner include:

-   Rapid scanning of accident sites and crime scenes for evidential recording and subsequent analysis;
-   Scanning 3D models, actors and sets for movies and computer games;
-   As-built surveying and design verification of large structures, such as aircraft, boat hulls or buildings;
-   Scanning of large 3D objects or assemblies, for example aircraft landing gear, for use in computer-based models used in applications such as online training or maintenance manuals;
-   3D scanning for archiving and subsequent virtual display of works of art such as statues and sculptures;
-   Terrain capture of archaeological sites; and
-   Scanning people for accurate modelling, fitting and visualisation of custom designed clothing.

The foregoing description of the invention includes preferred forms thereof. Modifications may be made thereto without departing from the scope of the invention as defined by the accompanying claims.

CLAIMS

1. A hand-held mobile 3D scanner for scanning a scene comprising: a range sensor that is arranged to sense the location of surface points in the scene relative to the scanner and generate representative location information; a texture sensor that is arranged to sense the texture of each surface point in the scan of the scene and generate representative texture information; a position and orientation sensor that is arranged to sense the position and orientation of the scanner during the scan of the scene by interacting with multiple reference targets located in random positions about the scene and generate representative position and orientation information; and a control system that is arranged to receive the information from each of the sensors and generate data representing the scan of the scene, wherein the positions of the reference targets are unknown to the control system at start-up of the control system.

2. A hand-held mobile 3D scanner according to claim 1 wherein the data generated relates to scanned surface points in the scene and comprises information on the 3D positions of those surface points in space and texture information in relation to those surface points.

3. A hand-held mobile 3D scanner according to claim 2 wherein the data further comprises viewpoint information in relation to the viewpoint from which the surface points were scanned by the scanner.

4. A hand-held mobile 3D scanner according to claim 2 wherein the control system is arranged to generate the texture information for the data based on texture values sensed by the texture sensor from multiple viewpoints during the scan.

5. A hand-held mobile 3D scanner according to claim 1 wherein the control system is arranged to generate a texture model representing the scan of the scene.

6. A hand-held mobile 3D scanner according to claim 1 wherein the data generated by the control system is in the form of rich-3D data.

7. A hand-held mobile 3D scanner according to claim 1 wherein the control system is arranged to generate a 3D substantially photo-realistic representation of the scene from the data.

8. A hand-held mobile 3D scanner according to claim 1 wherein the control system comprises a user interface that is operable by a user to control scanning parameters.

9. A hand-held mobile 3D scanner according to claim 1 wherein the control system comprises an output display that is arranged to generate a progressive representation of the scene as it is being scanned.

10. A hand-held mobile 3D scanner according to claim 1 wherein the control system is arranged to filter out data associated with scanned surface points in the scene that fall outside scanning zones that are selected by the user.

11. A hand-held mobile 3D scanner according to claim 1 wherein the control system is arranged to increase or decrease the resolution of the range and texture sensors for particular scanning zones that are selected by the user.

12. A hand-held mobile 3D scanner according to claim 1 wherein the range sensor comprises any one of the following: a light detection and ranging (LIDAR) device, a triangulation-based device, or a non-scanning time-of-flight camera device.

13. A hand-held mobile 3D scanner according to claim 1 wherein the texture sensor comprises a colour camera that is arranged to capture digital images of the scene, each digital image comprising an array of pixels and each pixel or group of pixels corresponding to a surface point in the scan of the scene from which texture information can be extracted.

14. A hand-held mobile 3D scanner according to claim 1 wherein the texture sensor comprises a multi-spectral laser imager that is arranged to sense texture information relating to the scanned surface points of the scene.

15. A hand-held mobile 3D scanner according to claim 1 wherein the position and orientation sensor comprises an optical tracking device that senses the position and orientation of the scanner by tracking visible reference targets located about the scene.

16. A hand-held mobile 3D scanner according to claim 15 wherein the optical tracking device comprises one or more direction sensors that are arranged to detect visible reference targets and generate direction information relating to the direction of the visible reference targets relative to the scanner, the optical tracking device processing the direction information to determine the position and orientation of the scanner.

17. A hand-held mobile 3D scanner according to claim 16 wherein the direction sensors are optical sensors that are each arranged to view outwardly relative to the scanner to provide direction information relating to any visible reference targets.

18. A hand-held mobile 3D scanner according to claim 15 wherein the position and orientation sensor additionally comprises an inertial sensor that is arranged to sense the position and orientation of the scanner and provide representative position and orientation information if the optical tracking device experiences target dropout.

19. A portable 3D scanning system for scanning a scene comprising: a hand-held mobile scanner comprising: a range sensor that is arranged to sense the location of surface points in the scene relative to the scanner and generate representative location information; a texture sensor that is arranged to sense the texture of each surface point in the scan of the scene and generate representative texture information; and a position and orientation sensor that is arranged to sense the position and orientation of the scanner during the scan of the scene and generate representative position and orientation information; multiple reference targets for placing randomly about the scene, the position and orientation sensor interacting with detectable reference targets to sense the position and orientation of the scanner; and a control system that is arranged to control the scanner and its sensors and the reference targets, receive the information from each of the sensors, and generate data representing the scan of the scene, wherein the positions of the reference targets are unknown to the control system at start-up of the control system.

20. A method of scanning a scene comprising the steps of: operating a hand-held mobile scanner to scan the scene, the scanner comprising: a range sensor that is arranged to sense the shape of the object(s) in the scene on a surface point-by-point basis and generate representative shape information; a texture sensor that is arranged to sense the texture of the object(s) in the scene on a surface point-by-point basis and generate representative texture information; and a position and orientation sensor that is arranged to sense the position and orientation of the scanner in a local reference frame by interacting with multiple reference targets located in random positions about the scene and generate representative position and orientation information, the positions of the reference targets being unknown to the control system at start-up of the control system; obtaining the shape, texture, and position and orientation information from the sensors; processing the shape, texture, and position and orientation information; and generating data representing the scan of the scene.